# Academic Document Processing

A VisionEncoderDecoder model that generates LaTeX formulas from images, using a Swin Transformer encoder and a GPT-2 decoder.

A baseline model built on VisionEncoderDecoderModel and fine-tuned to generate LaTeX formulas from images.

An OCR model specialized in recognizing LaTeX formulas that mix Chinese and English text, with support for local, offline CPU inference.

Pix2Text's Mathematical Formula Detection (MFD) model, which detects mathematical formulas in images.

Cephalo LaTeX Phi 3 Vision 128k 4b Beta

Cephalo is a series of large vision-language models focused on multimodal materials science. The current version specializes in converting images of mathematical formulas into LaTeX code.

Nougat For Formula

A mathematical formula recognition model fine-tuned from Nougat-small that excels at extracting LaTeX formula code from images.

Nougat is a vision-language model based on the Donut architecture, specifically designed for converting scientific PDFs into Markdown format.

© 2025 AIbase